May 23, 2021
Materials at https://github.com/CEIDatUGA/ECOL8540-datavis
The image properties (visual variables) should match the data properties. TRANSLATION: Don’t lie (even by omission).
Encode the most important information in the most effective way.
The most effective visualization conveys information in the most perceivable way. (Can be decoded the fastest and most accurately.)
Data is expressible in a visual language if the signs express all the facts and only the facts in the data.
Visual language cannot express all the facts in the data.
Visual language expresses facts not in the data.
Greatest number of ideas
in shortest time
with the least ink
in the smallest space
\[ \begin{align} \text{Data-ink Ratio } ~= ~&\frac{\text{data-ink}}{\text{total ink used to print the graphic}} \\\\ = ~&\text{proportion of a graphic's ink devoted to the} \\ ~&\text{non-redundant display of data-information} \\\\ = ~&1 - (\text{proportion of a graphic's ink that can be erased} \\ ~&\text{without loss of data-information}) \\ \end{align} \]
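As a worked instance of the formula above, the arithmetic can be sketched in a few lines of R. The ink quantities are made-up numbers in arbitrary units, chosen only for illustration; they do not come from any real graphic.

```r
# Hypothetical ink measurements (arbitrary units), for illustration only
data_ink  <- 30    # ink that encodes non-redundant data-information
total_ink <- 120   # all ink used to print the graphic

# Data-ink ratio: share of the ink that carries data
data_ink_ratio <- data_ink / total_ink

# Complement: share of the ink that could be erased without losing data
erasable_share <- 1 - data_ink_ratio

data_ink_ratio   # 0.25
erasable_share   # 0.75
```

Under these assumed numbers, three quarters of the ink could be removed without losing any data-information.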
bad
better
best
original - redundant = “the good part”
http://datavis-sp16.github.io/lectures/color
Use the HCL color space (Hue, Chroma, Luminance), which is approximately perceptually uniform.
Lightness is the most important and most accurate perceptual channel
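One way to act on this is a sequential palette that holds hue and chroma fixed and varies only the lightness (luminance) channel. The sketch below uses base R's `grDevices::hcl()`; the particular hue, chroma, and luminance values are arbitrary choices for illustration, not values from the lecture.

```r
# Sequential palette: fixed hue and chroma, only luminance changes.
# h, c, and l values are illustrative assumptions.
n <- 5
seq_pal <- grDevices::hcl(
  h = 260,                          # hue: a blue, held constant
  c = 40,                           # chroma: held constant
  l = seq(90, 30, length.out = n)   # luminance: light to dark
)
seq_pal

# Draw the swatches with base graphics
barplot(rep(1, n), col = seq_pal, border = NA, axes = FALSE, space = 0)
```

Because only one channel changes, the ordering of the colors should still be readable when the palette is printed in grayscale.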
{ggplot2}, {plotly}, and many other R packages have good defaults.
The {colorspace} package lets you build HCL color palettes: colorspace::choose_palette()
The {datacolor} package makes it easier to work with HCL, and also analyzes color palettes.
Problem: Color palette for a binned, continuous variable (e.g. show absolute rainfall quantities AND “low,” “medium,” and “high” rainfall categories).
Palette from http://www.hclwizard.org/why-hcl/ recreated with the {datacolor} R package
# install.packages("devtools") ## also installs the {remotes} package
# remotes::install_github("allopole/datacolor")
n <- 12 # palette length
why_hcl <- datacolor::hcl2hex(
  L = 100 * datacolor::rampx(from = .95, to = .35, n, exponent = 1.65),
  C = 100 * datacolor::stepx(from = .2, to = .77, n, step.n = n/4),
  H = datacolor::stepx(from = 65, to = 320, n = n, step.n = n/4)
)
why_hcl
## [1] "#FCEFD9" "#F8ECD6" "#F1E5CF" "#A6EBC9" "#9ADFBD" "#8CD1AF" "#70BBEA"
## [8] "#5BAAD9" "#4298C6" "#C34FAC" "#AE3398" "#9A0084"
datacolor::colorbar(why_hcl)
datacolor::colorplot(why_hcl, colorblind = TRUE)
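The same idea (hue and chroma change in steps to mark the categories, while luminance ramps continuously to show quantity) can be approximated with base R alone. This is a rough sketch, not a reimplementation of {datacolor}: it assumes four bins of three colors each, as the printed palette above suggests, and it uses a linear luminance ramp rather than the exponent of 1.65, so the hex values will differ from `why_hcl`.

```r
# Base-R approximation of a binned, continuous palette.
# Assumptions: 4 bins of 3 colors; linear luminance ramp.
n       <- 12            # palette length
n_bins  <- 4             # number of categories (assumed)
per_bin <- n / n_bins    # colors per category

# Hue and chroma: constant within a bin, jumping between bins
H <- rep(seq(65, 320, length.out = n_bins), each = per_bin)
C <- rep(seq(20, 77, length.out = n_bins), each = per_bin)

# Luminance: one continuous ramp from light to dark across all bins
L <- seq(95, 35, length.out = n)

binned_pal <- grDevices::hcl(h = H, c = C, l = L)
binned_pal

# Draw the swatches with base graphics
barplot(rep(1, n), col = binned_pal, border = NA, axes = FALSE, space = 0)
```

Within each bin only lightness changes, so neighboring colors read as the same category; the hue jump marks the boundary between categories.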
Kelleher, C. and Wagener, T., 2011. Ten guidelines for effective data visualization in scientific publications. Environmental Modelling & Software, 26(6), pp.822-827.
Rougier, N.P., Droettboom, M. and Bourne, P.E., 2014. Ten simple rules for better figures. PLoS Comput Biol, 10(9), p.e1003833.
Gregor Aisch (former graphics editor, New York Times). Using Data Visualization to Find Insights in Data. DataJournalism.com
Hadley Wickham (2010) A Layered Grammar of Graphics. Journal of Computational and Graphical Statistics 19:3-28.
Tufte, Edward: